The ICSI 2007 language recognition system

Authors

Christian A. Müller

Joan-Isaac Biel

Abstract

In this paper, we describe the ICSI 2007 language recognition system. The system is a variant of the classic PPRLM (parallel phone recognizer followed by language modeling) approach. We used a combination of frame-by-frame multilayer perceptron (MLP) phone classifiers for English, Arabic, and Mandarin and one open-loop hidden Markov model (HMM) phone recognizer (trained on English data). Maximum-likelihood language modeling is replaced by support vector machines (SVMs), a more powerful, discriminative classification method. Rank normalization is used as a normalization method superior to mean-variance normalization. Results are presented on the NIST 2005 language recognition evaluation (LRE05) set and on a test set taken from the LRE07 training corpus. The average NIST cost of the system on the LRE05 set is 0.0886.
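The abstract does not spell out how rank normalization is computed. A minimal sketch of one common formulation, assuming each feature value is replaced by the fraction of values in a held-out background set that do not exceed it, might look as follows. The function names and the toy background values are hypothetical and are not taken from the paper; a mean-variance normalizer is included only for contrast.

```python
from bisect import bisect_right
from statistics import mean, pstdev


def build_rank_normalizer(background):
    """Return a function mapping a value to its normalized rank in [0, 1].

    Assumption: the rank is the fraction of background values <= x.
    """
    sorted_bg = sorted(background)
    n = len(sorted_bg)
    if n == 0:
        raise ValueError("background set must not be empty")

    def normalize(x):
        # bisect_right counts how many background values are <= x
        return bisect_right(sorted_bg, x) / n

    return normalize


def build_mean_variance_normalizer(background):
    """Return a function that subtracts the mean and divides by the std dev."""
    mu = mean(background)
    sigma = pstdev(background)
    if sigma == 0:
        raise ValueError("background set has zero variance")

    def normalize(x):
        return (x - mu) / sigma

    return normalize


# Toy background values for one feature dimension, with a single outlier.
background = [0.1, 0.2, 0.4, 10.0]
rank_norm = build_rank_normalizer(background)
mv_norm = build_mean_variance_normalizer(background)
```

With this formulation the output stays in [0, 1] regardless of outliers in the background set, whereas the mean-variance output is unbounded and its scale is driven by the outlier.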

Similar resources

Advances in Mandarin broadcast speech recognition

We describe our continuing efforts to improve the UW-SRI-ICSI Mandarin broadcast speech recognizer. This includes increasing acoustic and text training data, adding discriminative features, incorporating a frame-level discriminative training criterion, multiple-pass acoustic model (AM) cross adaptation, language model (LM) genre adaptation and system combination. The net effect without LM adaptati...

Language Modeling in the ICSI-SRI Spring 2005 Meeting Speech Recognition Evaluation System

In this report, we describe the language models (LMs) used in the ICSI-SRI system for the NIST Spring 2005 Meeting Rich Transcription (RT-05S) evaluation. Our LMs are linear interpolations of n-gram models trained on a small number of in-domain sources and a large number of out-of-domain sources, which include conference proceedings and newly collected web data, in addition to other commonly-us...

Speaker adaptation of language models for automatic dialog act segmentation of meetings

Dialog act (DA) segmentation in meeting speech is important for meeting understanding. In this paper, we explore speaker adaptation of hidden event language models (LMs) for DA segmentation using the ICSI Meeting Corpus. Speaker adaptation is performed using a linear combination of the generic speaker-independent LM and an LM trained on only the data from individual speakers. We test the method ...

The SRI-ICSI Spring 2007 Meeting and Lecture Recognition System

We describe the latest version of the SRI-ICSI meeting and lecture recognition system, as was used in the NIST RT-07 evaluations, highlighting improvements made over the last year. Changes in the acoustic preprocessing include updated beamforming software for processing of multiple distant microphones, and various adjustments to the speech segmenter for close-talking microphones. Acoustic model...

Using Prosodic Features in Language Models for Meetings

Prosody has been actively studied as an important knowledge source for speech recognition and understanding. In this paper, we are concerned with the question of exploiting prosody for language models to aid automatic speech recognition in the context of meetings. Using an automatic syllable detection algorithm, the syllable-based prosodic features are extracted to form the prosodic representat...

Publication date: 2008
